Abstract. We investigate a one-parameter family of probability densities (related to the
Pareto distribution, which describes many natural phenomena) where the Cramér-Rao inequality
provides no information.



1. Cramér-Rao Inequality

One of the most important problems in statistics is estimating a population parameter from a
finite sample. As there are often many different estimators, it is desirable to be able to compare
them and say in what sense one estimator is better than another. One common approach is to
take the unbiased estimator with smaller variance. For example, if $X_1, \dots, X_n$ are independent
random variables uniformly distributed on $[0, \theta]$, $Y_n = \max_i X_i$ and
$\overline{X} = (X_1 + \cdots + X_n)/n$, then $\frac{n+1}{n} Y_n$ and $2\overline{X}$ are both
unbiased estimators of $\theta$, but the former has smaller variance than the latter and therefore
provides a tighter estimate.
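
The comparison of the two unbiased estimators can be checked by simulation. The following is a minimal sketch using only Python's standard library; the function name `simulate_uniform_estimators`, the sample size, the number of trials and the seed are illustrative assumptions and are not taken from the paper.

```python
import random
import statistics


def simulate_uniform_estimators(theta=1.0, n=10, trials=20000, seed=0):
    """Monte Carlo means and variances of (n+1)/n * max(X_i) and 2 * mean(X_i).

    All arguments are illustrative assumptions; the X_i are independent
    and uniform on [0, theta].
    """
    rng = random.Random(seed)
    max_based, mean_based = [], []
    for _ in range(trials):
        sample = [rng.uniform(0.0, theta) for _ in range(n)]
        max_based.append((n + 1) / n * max(sample))
        mean_based.append(2.0 * sum(sample) / n)
    return {
        "max_mean": statistics.fmean(max_based),
        "max_var": statistics.pvariance(max_based),
        "mean_mean": statistics.fmean(mean_based),
        "mean_var": statistics.pvariance(mean_based),
    }


if __name__ == "__main__":
    for name, value in simulate_uniform_estimators().items():
        print(f"{name}: {value:.5f}")
```

For $\theta = 1$ and $n = 10$ the exact variances are $\theta^2/(n(n+2)) = 1/120$ for the estimator based on the maximum and $\theta^2/(3n) = 1/30$ for the one based on the mean, so the simulated values should land near these.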

Two natural questions are (1) which estimator has the minimum variance, and (2) what bounds
are available on the variance of an unbiased estimator? The first question is very hard to solve
in general. Progress towards its solution is given by the Cramér-Rao inequality, which provides
a lower bound for the variance of an unbiased estimator (and thus if we find an estimator that
achieves this, we can conclude that we have a minimum variance unbiased estimator).
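
Cramér-Rao Inequality: Let $f(x;\theta)$ be a probability density function with continuous parameter
$\theta$. Let $X_1, \dots, X_n$ be independent random variables with density $f(x;\theta)$, and let
$\widehat{\Theta}(X_1, \dots, X_n)$ be an unbiased estimator of $\theta$. Assume that $f(x;\theta)$
satisfies two conditions:

(1) we have

$$\frac{\partial}{\partial\theta} \int \cdots \int \widehat{\Theta}(x_1, \dots, x_n) \prod_{i=1}^n f(x_i;\theta)\, dx_1 \cdots dx_n = \int \cdots \int \widehat{\Theta}(x_1, \dots, x_n)\, \frac{\partial \prod_{i=1}^n f(x_i;\theta)}{\partial\theta}\, dx_1 \cdots dx_n; \tag{1.1}$$

(2) for each $\theta$, the variance of $\widehat{\Theta}(X_1, \dots, X_n)$ is finite.

Then

$$\mathrm{var}(\widehat{\Theta}) \ge \frac{1}{n\, E\left[\left(\frac{\partial \log f(x;\theta)}{\partial\theta}\right)^2\right]}, \tag{1.2}$$

where $E$ denotes the expected value with respect to the probability density function $f(x;\theta)$.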
For a proof, see for example [CaBe]. The expected value in (1.2) is called the information
number or the Fisher information of the sample.

As variances are non-negative, the Cramér-Rao inequality (equation (1.2)) provides no useful
bounds on the variance of an unbiased estimator if the information is infinite, as in this case we
obtain the trivial bound that the variance is greater than or equal to zero. We find a simple
one-parameter family of probability density functions (related to the Pareto distribution) that
satisfy the conditions of the Cramér-Rao inequality, but the expectation (i.e., the information) is
infinite. Explicitly, our main result is



Theorem: Let

$$f(x;\theta) = \begin{cases} a_\theta / (x^\theta \log^3 x) & \text{if } x \ge e \\ 0 & \text{otherwise,} \end{cases} \tag{1.3}$$

where $a_\theta$ is chosen so that $f(x;\theta)$ is a probability density function. The information is infinite
when $\theta = 1$. Equivalently, the Cramér-Rao inequality yields the trivial (and useless) bound that
$\mathrm{Var}(\widehat{\Theta}) \ge 0$ for any unbiased estimator $\widehat{\Theta}$ of $\theta$ when $\theta = 1$.



In §2 we analyze the density in our theorem in great detail, deriving needed results about $a_\theta$
and its derivatives as well as discussing how $f(x;\theta)$ is related to important distributions used to
model many natural phenomena. We show the information is infinite when $\theta = 1$ in §3, which
proves our theorem. We also discuss there properties of estimators for $\theta$. While it is not clear
whether or not this distribution has an unbiased estimator, there is (at least for $\theta$ close to 1) an
asymptotically unbiased estimator rapidly converging to $\theta$ as the sample size tends to infinity. By
examining the proof of the Cramér-Rao inequality we see that we may weaken the assumption of
an unbiased estimator. While typically there is a cost in such a generalization, as our information
is infinite there is no cost in our case. We may therefore conclude that arguments such as those
used to prove the Cramér-Rao inequality cannot provide any information for estimators of $\theta$ from
this distribution.



2. An Almost Pareto Density 



Consider

$$f(x;\theta) = \begin{cases} a_\theta / (x^\theta \log^3 x) & \text{if } x \ge e \\ 0 & \text{otherwise,} \end{cases} \tag{2.1}$$

where $a_\theta$ is chosen so that $f(x;\theta)$ is a probability density function. Thus

$$\int_e^\infty \frac{a_\theta\, dx}{x^\theta \log^3 x} = 1. \tag{2.2}$$

We chose to have $\log^3 x$ in the denominator to ensure that the above integral converges, as does
$\log x$ times the integrand; however, the expected value (in the expectation in (1.2)) will not
converge.

For example, $1/(x \log x)$ diverges (its integral looks like $\log\log x$) but $1/(x \log^2 x)$ converges (its
integral looks like $1/\log x$); see pages 62-63 of [Rud] for more on close sequences where one
converges but the other does not. This distribution is close to the Pareto distribution (or a power
law). Pareto distributions are very useful in describing many natural phenomena; see for example
[DM, Ne, NM]. The inclusion of the factor of $\log^{-3} x$ allows us to have the exponent of $x$ in the
density function equal 1 and have the density function defined for arbitrarily large $x$; it is also
needed in order to apply the Dominated Convergence Theorem to justify some of the arguments
below. If we remove the logarithmic factors then we obtain a probability distribution only if the
density vanishes for large $x$. As $\log^3 x$ is a very slowly varying function, our distribution $f(x;\theta)$
may be of use in modeling data from an unbounded distribution where one wants to allow a
power law with exponent 1, but cannot as the resulting probability integral would diverge. Such
a situation occurs frequently in the Benford Law literature; see [Hi, Rai] for more details.

We study the variance bounds for unbiased estimators $\widehat{\Theta}$ of $\theta$, and in particular we show that
when $\theta = 1$ then the Cramér-Rao inequality yields a useless bound.

Note that it is not uncommon for the variance of an unbiased estimator to depend on the
value of the parameter being estimated. For example, consider again the uniform distribution on
$[0, \theta]$. Let $\overline{X}$ denote the sample mean of $n$ independent observations, and
$Y_n = \max_{1 \le i \le n} X_i$ be the largest observation. The expected values of $2\overline{X}$ and
$\frac{n+1}{n} Y_n$ are both $\theta$ (implying each is an unbiased estimator for $\theta$); however,
$\mathrm{Var}(2\overline{X}) = \theta^2/(3n)$ and $\mathrm{Var}(\frac{n+1}{n} Y_n) = \theta^2/(n(n+2))$ both depend
on $\theta$, the parameter being estimated (see, for example, page 324 of [MM] for these calculations).

Lemma 2.1. As a function of $\theta \in [1, \infty)$, $a_\theta$ is a strictly increasing function and $a_1 = 2$. It has
a one-sided derivative at $\theta = 1$, and $\frac{da_\theta}{d\theta}\big|_{\theta=1} \in (0, \infty)$.

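Proof. We have

$$a_\theta \int_e^\infty \frac{dx}{x^\theta \log^3 x} = 1. \tag{2.3}$$

When $\theta = 1$ we have

$$a_1 = \left[ \int_e^\infty \frac{dx}{x \log^3 x} \right]^{-1}, \tag{2.4}$$

which is clearly positive and finite. In fact, $a_1 = 2$ because the integral is

$$\int_e^\infty \frac{dx}{x \log^3 x} = \int_e^\infty \log^{-3} x\, \frac{d \log x}{dx}\, dx = \left. \frac{-1}{2 \log^2 x} \right|_e^\infty = \frac{1}{2}; \tag{2.5}$$

though all we need below is that $a_1$ is finite and non-zero, we have chosen to start integrating at
$e$ to make $a_1$ easy to compute.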

It is clear that $a_\theta$ is strictly increasing with $\theta$, as the integral in (2.3) is strictly decreasing with
increasing $\theta$ (because the integrand is decreasing with increasing $\theta$).

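We are left with determining the one-sided derivative of $a_\theta$ at $\theta = 1$, as the derivative at any
other point is handled similarly (but with easier convergence arguments). It is technically easier
to study the derivative of $1/a_\theta$, as

$$\frac{d}{d\theta} \frac{1}{a_\theta} = -\frac{1}{a_\theta^2} \frac{da_\theta}{d\theta} \tag{2.6}$$

and

$$\frac{1}{a_\theta} = \int_e^\infty \frac{dx}{x^\theta \log^3 x}. \tag{2.7}$$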

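The reason we consider the derivative of $1/a_\theta$ is that this avoids having to take the derivative of
the reciprocals of integrals. As $a_1$ is finite and non-zero, it is easy to pass to
$\frac{da_\theta}{d\theta}\big|_{\theta=1}$. Thus we have

$$\frac{d}{d\theta} \frac{1}{a_\theta} \bigg|_{\theta=1} = \lim_{h \to 0^+} \frac{1}{h} \left[ \int_e^\infty \frac{dx}{x^{1+h} \log^3 x} - \int_e^\infty \frac{dx}{x \log^3 x} \right] = \lim_{h \to 0^+} \int_e^\infty \frac{1 - x^h}{h} \frac{1}{x^h} \frac{dx}{x \log^3 x}. \tag{2.8}$$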

We want to interchange the integration with respect to $x$ and the limit with respect to $h$ above.
This interchange is permissible by the Dominated Convergence Theorem (see Appendix A for
details of the justification). Note

$$\lim_{h \to 0^+} \frac{1 - x^h}{h} \frac{1}{x^h} = -\log x; \tag{2.9}$$

one way to see this is to use that the limit of a product is the product of the limits, and then use
L'Hospital's rule, writing $x^h$ as $e^{h \log x}$. Therefore

$$\frac{d}{d\theta} \frac{1}{a_\theta} \bigg|_{\theta=1} = -\int_e^\infty \frac{dx}{x \log^2 x}; \tag{2.10}$$

as this is finite and non-zero, this completes the proof and shows
$\frac{da_\theta}{d\theta}\big|_{\theta=1} \in (0, \infty)$. □



Remark 2.2. We see now why we chose $f(x;\theta) = a_\theta / (x^\theta \log^3 x)$ instead of
$f(x;\theta) = a_\theta / (x^\theta \log^2 x)$. If we only had two factors of $\log x$ in the denominator, then the
one-sided derivative of $a_\theta$ at $\theta = 1$ would be infinite.

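Remark 2.3. Though the actual value of $\frac{da_\theta}{d\theta}\big|_{\theta=1}$ does not matter, we can compute it quite
easily. By (2.10) we have

$$\frac{d}{d\theta} \frac{1}{a_\theta} \bigg|_{\theta=1} = -\int_e^\infty \frac{dx}{x \log^2 x} = -\int_e^\infty \log^{-2} x\, \frac{d \log x}{dx}\, dx = \left. \frac{1}{\log x} \right|_e^\infty = -1. \tag{2.11}$$

Thus by (2.6), and the fact that $a_1 = 2$ (Lemma 2.1), we have

$$\frac{da_\theta}{d\theta} \bigg|_{\theta=1} = -a_1^2 \cdot \frac{d}{d\theta} \frac{1}{a_\theta} \bigg|_{\theta=1} = 4. \tag{2.12}$$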
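
The constants in Lemma 2.1 and Remark 2.3 can be checked numerically. The sketch below is not part of the paper: it assumes the substitution $t = 1/\log x$, which turns $1/a_\theta = \int_e^\infty dx/(x^\theta \log^3 x)$ into $\int_0^1 t\, e^{-(\theta-1)/t}\, dt$, and evaluates that integral with a midpoint rule. The function names, the step count and the step size `h` are illustrative choices.

```python
import math


def inverse_normalizer(theta, steps=200_000):
    """Approximate 1/a_theta = int_e^inf dx / (x^theta log^3 x) for theta >= 1.

    With t = 1/log x the integral becomes int_0^1 t * exp(-(theta-1)/t) dt,
    evaluated here by the midpoint rule (steps is an assumed resolution).
    """
    width = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += t * math.exp(-(theta - 1.0) / t)
    return total * width


def normalizer(theta, steps=200_000):
    """Approximate a_theta itself."""
    return 1.0 / inverse_normalizer(theta, steps)


def one_sided_derivative_of_inverse(h=1e-3, steps=200_000):
    """Forward difference quotient for d/dtheta (1/a_theta) at theta = 1."""
    upper = inverse_normalizer(1.0 + h, steps)
    lower = inverse_normalizer(1.0, steps)
    return (upper - lower) / h


if __name__ == "__main__":
    a1 = normalizer(1.0)
    slope = one_sided_derivative_of_inverse()
    print("a_1 ~", a1)
    print("d/dtheta (1/a_theta) at theta = 1 ~", slope)
    print("d a_theta / d theta at theta = 1 ~", -a1 ** 2 * slope)
```

Because `h` is finite, the difference quotient only approximates the limit $-1$; multiplying by $-a_1^2$ as in (2.6) should give a value close to 4.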



3. Computing the Information 

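We now compute the expected value, $E\left[\left(\frac{\partial \log f(x;\theta)}{\partial\theta}\right)^2\right]$; showing it is infinite
when $\theta = 1$ completes the proof of our main result. Note

$$\log f(x;\theta) = \log a_\theta - \theta \log x + \log \log^{-3} x$$

$$\frac{\partial \log f(x;\theta)}{\partial\theta} = \frac{1}{a_\theta} \frac{da_\theta}{d\theta} - \log x. \tag{3.1}$$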
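
By Lemma 2.1 we know that $\frac{da_\theta}{d\theta}$ is finite for each $\theta \ge 1$. Thus

$$E\left[\left(\frac{\partial \log f(x;\theta)}{\partial\theta}\right)^2\right] = E\left[\left(\frac{1}{a_\theta} \frac{da_\theta}{d\theta} - \log x\right)^2\right] = \int_e^\infty \left(\frac{1}{a_\theta} \frac{da_\theta}{d\theta} - \log x\right)^2 \cdot \frac{a_\theta\, dx}{x^\theta \log^3 x}. \tag{3.2}$$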



If $\theta > 1$ then the expectation is finite and non-zero. We are left with the interesting case when
$\theta = 1$. As $\frac{da_\theta}{d\theta}\big|_{\theta=1}$ is finite and non-zero, for $x$ sufficiently large (say $x \ge x_1$ for some
$x_1$, though by Remark 2.3 we see that we may take any $x_1 \ge e^4$) we have

$$\left| \frac{1}{a_1} \frac{da_\theta}{d\theta} \bigg|_{\theta=1} \right| \le \frac{\log x}{2}. \tag{3.3}$$



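As $a_1 = 2$, we have

$$E\left[\left(\frac{\partial \log f(x;\theta)}{\partial\theta}\right)^2\right] \bigg|_{\theta=1} \ge \int_{x_1}^\infty \left(\frac{\log x}{2}\right)^2 \frac{a_1\, dx}{x \log^3 x} = \int_{x_1}^\infty \frac{dx}{2 x \log x} = \frac{1}{2} \int_{x_1}^\infty \frac{dx}{x \log x} = \frac{1}{2} \log\log x \bigg|_{x_1}^\infty = \infty. \tag{3.4}$$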

Thus the expectation is infinite. Let $\widehat{\Theta}$ be any unbiased estimator of $\theta$. If $\theta = 1$ then the
Cramér-Rao inequality gives

$$\mathrm{var}(\widehat{\Theta}) \ge 0, \tag{3.5}$$

which provides no information as variances are always non-negative. This completes the proof of
our theorem. □
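
To see how slowly the information diverges, one can truncate the integral in (3.2) at $x = T$ for $\theta = 1$. The sketch below is an illustration and not part of the paper: it uses $a_1 = 2$ and $\frac{1}{a_1}\frac{da_\theta}{d\theta}\big|_{\theta=1} = 2$ from Remark 2.3, substitutes $u = \log x$, and compares the resulting closed form $2\log U + 8/U - 4/U^2 - 4$ (with $U = \log T$) against a midpoint-rule evaluation. The function names and step count are assumptions.

```python
import math


def truncated_information(log_T):
    """Closed form of int_e^T (2 - log x)^2 * 2 dx / (x log^3 x), U = log T >= 1."""
    U = log_T
    return 2.0 * math.log(U) + 8.0 / U - 4.0 / U ** 2 - 4.0


def truncated_information_numeric(log_T, steps=200_000):
    """Midpoint-rule check of the same integral in the variable u = log x."""
    width = (log_T - 1.0) / steps
    total = 0.0
    for k in range(steps):
        u = 1.0 + (k + 0.5) * width
        total += 2.0 * (2.0 - u) ** 2 / u ** 3
    return total * width


if __name__ == "__main__":
    for log_T in (10.0, 1e3, 1e6, 1e12):
        print(f"log T = {log_T:g}: {truncated_information(log_T):.4f}")
```

The truncated value grows like $2 \log\log T$, so it is unbounded but extremely slow: even for $\log T = 10^{12}$ it is only about 51.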

We now discuss estimators for $\theta$ for our distribution $f(x;\theta)$. If $X_1, \dots, X_n$ are $n$ independent
random variables with common distribution $f(x;\theta)$, then as $n \to \infty$ the sample median converges
to the population median $\tilde{\mu}_\theta$ (if $n = 2m + 1$ then the sample median converges to being normally
distributed with median $\tilde{\mu}_\theta$ and variance $1/(8m f(\tilde{\mu}_\theta;\theta)^2)$; see for example Theorem 8.17 of [MM]).

Figure 1. Plot of the median $\tilde{\mu}_\theta$ of $f(x;\theta)$ as a function of $\theta$ ($\tilde{\mu}_1 = e^{\sqrt{2}}$).



For $\theta$ close to 1 we see in Figure 1 that the median $\tilde{\mu}_\theta$ of $f(x;\theta)$ is strictly decreasing with
increasing $\theta$, which implies that there is an inverse function $g$ such that $g(\tilde{\mu}_\theta) = \theta$. We obtain an
estimator for $\theta$ by applying $g$ to the sample median. This estimator is a consistent estimator (as
the sample size tends to infinity it will tend to $\theta$) and should be asymptotically unbiased.
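
The population median can be computed numerically. The following sketch is not from the paper: it assumes the substitution $t = 1/\log x$, under which the unnormalized mass of $\{\log X > 1/c\}$ is $\int_0^c t\, e^{-(\theta-1)/t}\, dt$, and finds by bisection the value of $c$ at which this equals half the total mass. The function names `tail_mass` and `median` and the numerical parameters are illustrative assumptions.

```python
import math


def tail_mass(theta, c, steps=5_000):
    """Unnormalized P(log X > 1/c): int_0^c t * exp(-(theta-1)/t) dt (midpoint rule)."""
    width = c / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * width
        total += t * math.exp(-(theta - 1.0) / t)
    return total * width


def median(theta, iterations=50):
    """Population median of f(x; theta), found by bisection on c = 1/log x."""
    half = 0.5 * tail_mass(theta, 1.0)  # half of the total (unnormalized) mass
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if tail_mass(theta, mid) < half:
            lo = mid  # too little mass above 1/mid, so move the cut to smaller x
        else:
            hi = mid
    return math.exp(1.0 / (0.5 * (lo + hi)))


if __name__ == "__main__":
    for theta in (1.0, 1.1, 1.2, 1.3, 1.4):
        print(theta, median(theta))
```

For $\theta = 1$ the result should agree with $e^{\sqrt{2}}$. An approximation to the inverse function $g$ could be obtained by a second bisection over $\theta$ applied to `median`; that step is omitted here.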
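
The proof of the Cramér-Rao inequality starts with

$$0 = \int \cdots \int \left(\widehat{\Theta}(x_1, \dots, x_n) - \theta\right) h(x_1;\theta) \cdots h(x_n;\theta)\, dx_1 \cdots dx_n, \tag{3.6}$$

where $\widehat{\Theta}(x_1, \dots, x_n)$ is an unbiased estimator of $\theta$ depending only on the sample values $x_1, \dots, x_n$.
In our case (when each $h(x;\theta) = f(x;\theta)$) we may not have an unbiased estimator. If we denote
this expectation by $\mathcal{F}(\theta)$, for our investigations all that we require is that $d\mathcal{F}(\theta)/d\theta$ is finite (which
is easy to show). Going through the proof of the Cramér-Rao inequality shows that the effect
of this is to replace the factor of 1 in (1.2) with $(1 + d\mathcal{F}(\theta)/d\theta)^2$; thus the generalization of the
Cramér-Rao inequality for our estimator is

$$\mathrm{var}(\widehat{\Theta}) \ge \frac{\left(1 + d\mathcal{F}(\theta)/d\theta\right)^2}{n\, E\left[\left(\frac{\partial \log f(x;\theta)}{\partial\theta}\right)^2\right]}. \tag{3.7}$$

As our information is infinite for $\theta = 1$ we see that, no matter what 'nice' estimator we use, we will
not obtain any useful information from such arguments.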

Appendix A. Applying the Dominated Convergence Theorem

We justify applying the Dominated Convergence Theorem in the proof of Lemma 2.1. See, for
example, [SS] for the conditions and a proof of the Dominated Convergence Theorem.

Lemma A.1. For each fixed $h > 0$ and any $x \ge e$, we have

$$\left| \frac{1 - x^h}{h} \frac{1}{x^h} \right| \le e \log x, \tag{A.1}$$

and $\frac{e \log x}{x \log^3 x}$ is positive and integrable, and dominates each
$\frac{1 - x^h}{h} \frac{1}{x^h} \frac{1}{x \log^3 x}$.



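Proof. We first prove (A.1). As $x \ge e$ and $h > 0$, note $x^h \ge 1$. Consider the case of $1/h \le \log x$.
Since $|1 - x^h| < 1 + x^h \le 2x^h$, we have

$$\frac{|1 - x^h|}{h x^h} < \frac{2 x^h}{h x^h} = \frac{2}{h} \le 2 \log x. \tag{A.2}$$

We are left with the case of $1/h > \log x$, or $h \log x < 1$. We have

$$|1 - x^h| = \left|1 - e^{h \log x}\right| = \left| \sum_{n=1}^\infty \frac{(h \log x)^n}{n!} \right| = h \log x \sum_{n=1}^\infty \frac{(h \log x)^{n-1}}{n!} < h \log x \sum_{n=1}^\infty \frac{(h \log x)^{n-1}}{(n-1)!} = h \log x \cdot e^{h \log x}. \tag{A.3}$$

This, combined with $h \log x < 1$ and $x^h \ge 1$, yields

$$\frac{|1 - x^h|}{h x^h} < \frac{e h \log x}{h} = e \log x. \tag{A.4}$$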

It is clear that $\frac{e \log x}{x \log^3 x}$ is positive and integrable, and by L'Hospital's rule (see (2.9)) we have that

$$\lim_{h \to 0^+} \frac{1 - x^h}{h} \frac{1}{x^h} \frac{1}{x \log^3 x} = -\frac{1}{x \log^2 x}. \tag{A.5}$$

Thus the Dominated Convergence Theorem implies that

$$\lim_{h \to 0^+} \int_e^\infty \frac{1 - x^h}{h} \frac{1}{x^h} \frac{dx}{x \log^3 x} = -\int_e^\infty \frac{dx}{x \log^2 x} = -1 \tag{A.6}$$

(the last equality is derived in Remark 2.3). □
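
The bound (A.1) and the limit (2.9) can be spot-checked on a grid. This is only a numerical illustration under assumed grid values, not a substitute for the proof; the function names are hypothetical. The quotient $(1 - x^h)/(h x^h)$ is rewritten as $(x^{-h} - 1)/h$ and evaluated with `math.expm1` to limit rounding error for small $h$.

```python
import math


def difference_quotient(x, h):
    """(1 - x^h) / (h * x^h), computed as (x^(-h) - 1) / h via expm1."""
    return math.expm1(-h * math.log(x)) / h


def check_domination(xs, hs):
    """Return True if |(1 - x^h)/(h x^h)| <= e * log x on the whole grid."""
    return all(
        abs(difference_quotient(x, h)) <= math.e * math.log(x)
        for x in xs
        for h in hs
    )


if __name__ == "__main__":
    xs = [math.e, 10.0, 1e3, 1e6, 1e12]   # assumed sample points, all >= e
    hs = [1e-6, 1e-3, 0.1, 1.0, 10.0]     # assumed values of h > 0
    print("bound (A.1) holds on grid:", check_domination(xs, hs))
    print("quotient at x = 1e3, h = 1e-6:", difference_quotient(1e3, 1e-6))
    print("-log(1e3):", -math.log(1e3))
```

For small `h` the quotient should be close to $-\log x$, in line with (2.9).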